{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "f1efabbe-689f-4908-822f-1ff59834cba7",
   "metadata": {},
   "source": [
    "# Optional Lab: Logistic Regression, Decision Boundary\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e36e717-2787-460b-97a2-45d7b08fbfe7",
   "metadata": {},
   "source": [
    "## Goals\n",
    "In this lab, you will:\n",
    "- Plot the decision boundary for a logistic regression model. This will give you a better sense of what the model is predicting.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "486371c2-c0b2-46d2-a6a1-492f52f1e2aa",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "%matplotlib widget\n",
    "import matplotlib.pyplot as plt\n",
    "from lab_utils_common import plot_data, sigmoid, draw_vthresh\n",
    "plt.style.use('./deeplearning.mplstyle')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f88646eb-e20e-4736-83cc-2a583e328742",
   "metadata": {},
   "source": [
    "## Dataset\n",
    "\n",
    "Let's suppose you have following training dataset\n",
    "- The input variable `X` is a numpy array which has 6 training examples, each with two features\n",
    "- The output variable `y` is also a numpy array with 6 examples, and `y` is either `0` or `1`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fa08bca1-754e-4b7e-b21c-a8c5302f2ffd",
   "metadata": {},
   "outputs": [],
   "source": [
    "X = np.array([[0.5, 1.5], [1,1], [1.5, 0.5], [3, 0.5], [2, 2], [1, 2.5]])\n",
    "y = np.array([0, 0, 0, 1, 1, 1]).reshape(-1,1) "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6b9e7776-c435-483a-85e9-b6f7b8957d78",
   "metadata": {},
   "source": [
    "### Plot data \n",
    "\n",
    "Let's use a helper function to plot this data. The data points with label $y=1$ are shown as red crosses, while the data points with label $y=0$ are shown as blue circles. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3fc2e7d-5fc5-448b-8a1c-cafa5e5edb4a",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig,ax = plt.subplots(1,1,figsize=(4,4))\n",
    "plot_data(X, y, ax)\n",
    "\n",
    "ax.axis([0, 4, 0, 3.5])\n",
    "ax.set_ylabel('$x_1$')\n",
    "ax.set_xlabel('$x_0$')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b93e5bdc-f2dc-4122-905e-45c757c2d837",
   "metadata": {},
   "source": [
    "## Logistic regression model\n",
    "\n",
    "\n",
    "* Suppose you'd like to train a logistic regression model on this data which has the form   \n",
    "\n",
    "  $f(x) = g(w_0x_0+w_1x_1 + b)$\n",
    "  \n",
    "  where $g(z) = \\frac{1}{1+e^{-z}}$, which is the sigmoid function\n",
    "\n",
    "\n",
    "* Let's say that you trained the model and get the parameters as $b = -3, w_0 = 1, w_1 = 1$. That is,\n",
    "\n",
    "  $f(x) = g(x_0+x_1-3)$\n",
    "\n",
    "  (You'll learn how to fit these parameters to the data further in the course)\n",
    "  \n",
    "  \n",
    "Let's try to understand what this trained model is predicting by plotting its decision boundary"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37be429c-f292-4fce-a093-e50b7746af50",
   "metadata": {},
   "source": [
    "### Refresher on logistic regression and decision boundary\n",
    "\n",
    "* Recall that for logistic regression, the model is represented as \n",
    "\n",
    "  $$f_{\\mathbf{w},b}(\\mathbf{x}^{(i)}) = g(\\mathbf{w} \\cdot \\mathbf{x}^{(i)} + b) \\tag{1}$$\n",
    "\n",
    "  where $g(z)$ is known as the sigmoid function and it maps all input values to values between 0 and 1:\n",
    "\n",
    "  $g(z) = \\frac{1}{1+e^{-z}}\\tag{2}$\n",
    "  and $\\mathbf{w} \\cdot \\mathbf{x}$ is the vector dot product:\n",
    "  \n",
    "  $$\\mathbf{w} \\cdot \\mathbf{x} = w_0 x_0 + w_1 x_1$$\n",
    "  \n",
    "  \n",
    " * We interpret the output of the model ($f_{\\mathbf{w},b}(x)$) as the probability that $y=1$ given $x$ and parameterized by $w$ and $b$.\n",
    "* Therefore, to get a final prediction ($y=0$ or $y=1$) from the logistic regression model, we can use the following heuristic -\n",
    "\n",
    "  if $f_{\\mathbf{w},b}(x) >= 0.5$, predict $y=1$\n",
    "  \n",
    "  if $f_{\\mathbf{w},b}(x) < 0.5$, predict $y=0$\n",
    "  \n",
    "  \n",
    "* Let's plot the sigmoid function to see where $g(z) >= 0.5$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5a50e78c-8f4c-41ec-8820-b4c25c4b405a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot sigmoid(z) over a range of values from -10 to 10\n",
    "z = np.arange(-10,11)\n",
    "\n",
    "fig,ax = plt.subplots(1,1,figsize=(5,3))\n",
    "# Plot z vs sigmoid(z)\n",
    "ax.plot(z, sigmoid(z), c=\"b\")\n",
    "\n",
    "ax.set_title(\"Sigmoid function\")\n",
    "ax.set_ylabel('sigmoid(z)')\n",
    "ax.set_xlabel('z')\n",
    "draw_vthresh(ax,0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f3c1272-e43c-4f6a-9a51-c08dbcb0cb61",
   "metadata": {},
   "source": [
    "* As you can see, $g(z) >= 0.5$ for $z >=0$\n",
    "\n",
    "* For a logistic regression model, $z = \\mathbf{w} \\cdot \\mathbf{x} + b$. Therefore,\n",
    "\n",
    "  if $\\mathbf{w} \\cdot \\mathbf{x} + b >= 0$, the model predicts $y=1$\n",
    "  \n",
    "  if $\\mathbf{w} \\cdot \\mathbf{x} + b < 0$, the model predicts $y=0$\n",
    "  \n",
    "  \n",
    "  \n",
    "### Plotting decision boundary\n",
    "\n",
    "Now, let's go back to our example to understand how the logistic regression model is making predictions.\n",
    "\n",
    "* Our logistic regression model has the form\n",
    "\n",
    "  $f(x) = g(-3 + x_0+x_1)$\n",
    "\n",
    "\n",
    "* From what you've learnt above, you can see that this model predicts $y=1$ if $-3 + x_0+x_1 >= 0$\n",
    "\n",
    "Let's see what this looks like graphically. We'll start by plotting $-3 + x_0+x_1 = 0$, which is equivalent to $x_1 = 3 - x_0$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d0d00a4-04b4-4397-8e94-9b22534f02d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Choose values between 0 and 6\n",
    "x0 = np.arange(0,6)\n",
    "\n",
    "x1 = 3 - x0\n",
    "fig,ax = plt.subplots(1,1,figsize=(5,4))\n",
    "# Plot the decision boundary\n",
    "ax.plot(x0,x1, c=\"b\")\n",
    "ax.axis([0, 4, 0, 3.5])\n",
    "\n",
    "# Fill the region below the line\n",
    "ax.fill_between(x0,x1, alpha=0.2)\n",
    "\n",
    "# Plot the original data\n",
    "plot_data(X,y,ax)\n",
    "ax.set_ylabel(r'$x_1$')\n",
    "ax.set_xlabel(r'$x_0$')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9dc19ca0-bb1a-4a73-b05b-6551edb8248c",
   "metadata": {},
   "source": [
    "* In the plot above, the blue line represents the line $x_0 + x_1 - 3 = 0$ and it should intersect the x1 axis at 3 (if we set $x_1$ = 3, $x_0$ = 0) and the x0 axis at 3 (if we set $x_1$ = 0, $x_0$ = 3). \n",
    "\n",
    "\n",
    "* The shaded region represents $-3 + x_0+x_1 < 0$. The region above the line is $-3 + x_0+x_1 > 0$.\n",
    "\n",
    "\n",
    "* Any point in the shaded region (under the line) is classified as $y=0$.  Any point on or above the line is classified as $y=1$. This line is known as the \"decision boundary\".\n",
    "\n",
    "As we've seen in the lectures, by using higher order polynomial terms (eg: $f(x) = g( x_0^2 + x_1 -1)$, we can come up with more complex non-linear boundaries."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "756bd4f9-2f42-460c-9ed4-b2b5e842850b",
   "metadata": {},
   "source": [
    "## Congratulations!\n",
    "You have explored the decision boundary in the context of logistic regression."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
